Active learning for clinical text classification: is it better than random sampling?
Similar resources
Active learning for clinical text classification: is it better than random sampling?
OBJECTIVE This study explores active learning algorithms as a way to reduce the requirements for large training sets in medical text classification tasks. DESIGN Three existing active learning algorithms (distance-based (DIST), diversity-based (DIV), and a combination of both (CMB)) were used to classify text from five datasets. The performance of these algorithms was compared to that of pass...
Why is Posterior Sampling Better than Optimism for Reinforcement Learning?
Computational results demonstrate that posterior sampling for reinforcement learning (PSRL) dramatically outperforms existing algorithms driven by optimism, such as UCRL2. We provide insight into the extent of this performance boost and the phenomenon that drives it. We leverage this insight to establish an $\tilde{O}(H\sqrt{SAT})$ Bayesian regret bound for PSRL in finite-horizon episodic Markov decision ...
Active Learning with Rationales for Text Classification
We present a simple yet effective approach that can incorporate rationales elicited from annotators into the training of any off-the-shelf classifier. We show that our simple approach is effective for multinomial naïve Bayes, logistic regression, and support vector machines. We additionally present an active learning method tailored specifically for the learning with rationales framework.
Pool-Based Active Learning for Text Classification
This paper shows how a text classifier’s need for labeled training documents can be reduced by employing a large pool of unlabeled documents. We modify the Query-by-Committee (QBC) method of active learning to use the unlabeled pool by explicitly estimating document density when selecting examples for labeling. Then active learning is combined with Expectation-Maximization in order to “fill in”...
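As a rough illustration of the Query-by-Committee idea mentioned in this abstract, the sketch below ranks unlabeled documents by vote entropy, a common committee-disagreement measure. It is not the density-weighted variant the paper describes, and the document names and votes are made up for the example.

```python
import math
from collections import Counter
from typing import Dict, List


def vote_entropy(votes: List[str]) -> float:
    """Entropy of the committee's label votes; higher means more disagreement."""
    counts = Counter(votes)
    total = len(votes)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# Hypothetical votes from a three-member committee on three unlabeled documents.
committee_votes: Dict[str, List[str]] = {
    "doc_a": ["pos", "pos", "pos"],  # full agreement
    "doc_b": ["pos", "neg", "pos"],  # partial disagreement
    "doc_c": ["pos", "neg", "neu"],  # maximal disagreement
}

# Query the document on which the committee disagrees most.
query = max(committee_votes, key=lambda d: vote_entropy(committee_votes[d]))
```

Here `query` is the document sent to the annotator next; a density-aware method would additionally weight each score by how representative the document is of the pool.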
Active learning for text classification with reusability
Where active learning with uncertainty sampling is used to generate training sets for classification applications, it is sensible to use the same type of classifier to select the most informative training examples as the type of classifier that will be used in the final classification application. There are scenarios, however, where this might not be possible, for example due to computational c...
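Uncertainty sampling, as referred to in this abstract, can be sketched with the least-confidence criterion: pick the pool examples whose most probable class has the lowest predicted probability. The probabilities below are invented for illustration, and the function names are hypothetical rather than taken from the paper.

```python
from typing import List, Sequence


def least_confidence(probs: Sequence[float]) -> float:
    """Uncertainty score: 1 minus the probability of the most likely class."""
    return 1.0 - max(probs)


def select_most_uncertain(pool_probs: Sequence[Sequence[float]], k: int) -> List[int]:
    """Return indices of the k pool examples with the highest uncertainty."""
    scores = [least_confidence(p) for p in pool_probs]
    ranked = sorted(range(len(pool_probs)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Hypothetical class probabilities predicted by the selecting classifier.
pool_probs = [
    [0.95, 0.05],
    [0.55, 0.45],
    [0.70, 0.30],
    [0.51, 0.49],
]

chosen = select_most_uncertain(pool_probs, k=2)
```

The reusability question raised in the abstract concerns what happens when the classifier producing `pool_probs` differs in type from the classifier eventually trained on the selected examples.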
Journal
Journal title: Journal of the American Medical Informatics Association
Year: 2012
ISSN: 1067-5027, 1527-974X
DOI: 10.1136/amiajnl-2011-000648